Intro

Who can live without music? Probably no one. There is a well-known saying that we are what we listen to: the songs you choose may reflect your personality and what you have been going through.

Today, we’re gonna look at the songs we listen to every day from a broader perspective and analyze them in detail, letting the data tell the story. Our analysis is based on two dimensions, genre and time, and covers the following:

  • Song level analysis:
    • Number of songs since the 1970s
    • Distribution of songs across genres
    • Proportions of different genres throughout the past decades
  • Lyric level analysis:
    • Overall word cloud of all songs
    • Popular words in different genres (word cloud per genre)
    • Lyric sentiment by genre
    • Popular words through decades
    • Popular words through years

Data description:

  • Data source: The data set is a filtered corpus of 380,000+ song lyrics from MetroLyrics. Its structure is artist/year/song.
  • Text processing: We cleaned the text by converting all letters to lower case and removing punctuation, numbers, empty words, and extra white space. We then reduced the words to their stems and converted the “tm” object to a “tidy” object for much faster processing. This keeps the processed words close to the structure of the original lyrics while making them more representative.
  • New variables: To assist our analysis, we generated two new variables: “nword”, the number of words in each song’s lyrics, and “decade”, which groups songs by time period.
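The text processing and the two derived variables described above can be sketched roughly as follows. The original analysis appears to have been done in R (it mentions “tm” and “tidy” objects); this sketch uses standard-library Python purely for illustration. The stop-word list, the crude suffix stripper, the sample song, and the choice to count “nword” on the raw lyrics are all assumptions, not the authors’ actual code.

```python
import re
import string

# Hypothetical stop-word list for illustration; the original list is not given.
STOP_WORDS = {"the", "a", "and", "to", "of", "in", "is", "it"}


def crude_stem(word):
    """Very rough suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def clean_lyrics(text):
    """Lower-case, drop numbers and ASCII punctuation, drop stop words, stem."""
    text = text.lower()
    text = re.sub(r"[0-9]+", " ", text)
    text = text.translate(str.maketrans("", "", string.punctuation))
    words = [w for w in text.split() if w not in STOP_WORDS]  # split() also collapses extra white space
    return [crude_stem(w) for w in words]


def decade_of(year):
    """Map a year to a decade label, e.g. 2006 -> '2000s'."""
    return f"{year // 10 * 10}s"


# Made-up example song, not taken from the data set.
song = {"year": 2006, "lyrics": "Dancing in the rain, 2 hearts beating!"}
tokens = clean_lyrics(song["lyrics"])
song["nword"] = len(song["lyrics"].split())  # assumption: counted on the raw lyrics
song["decade"] = decade_of(song["year"])
print(tokens, song["nword"], song["decade"])
```

A real pipeline would swap in a proper stemmer and a published stop-word list; the point here is only the order of the steps.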

1. Song level analysis

Alright, now let’s take a look at the songs and see how the song population changed over the past 46 years.

According to the bar chart, the number of songs clearly increases over time. The music industry has become much more productive since the new millennium. There is also a huge boom in the song population in 2006 and 2007, which together account for almost 50 percent of the total. Interested readers can dig in to find out what happened in those years to cause such a significant increase.

What does the song population look like across different genres?

Clearly, Rock takes up the largest share of the music world at over 50 percent, followed by Pop, Metal, and Country.

Did different genres evolve throughout the decades?

We can see that Rock has stayed at the top throughout the past decades. However, since the 90s, Pop has grown really fast and is now the second most popular genre. It is also worth mentioning that Metal prospered in the first decade of the new millennium, which may reflect some cultural phenomenon behind it.
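For readers who want to reproduce this kind of breakdown, here is a minimal sketch of computing each genre’s share of songs per decade. It is written in standard-library Python for illustration only; the records are invented toy data, and the function name `genre_shares_by_decade` is hypothetical.

```python
from collections import Counter, defaultdict

# Toy records, invented for illustration only; not taken from the data set.
songs = [
    {"year": 1985, "genre": "Rock"},
    {"year": 1988, "genre": "Pop"},
    {"year": 1994, "genre": "Rock"},
    {"year": 1997, "genre": "Rock"},
    {"year": 1999, "genre": "Pop"},
    {"year": 2006, "genre": "Metal"},
]


def genre_shares_by_decade(records):
    """Return {decade: {genre: share of that decade's songs}}."""
    counts = defaultdict(Counter)
    for rec in records:
        decade = rec["year"] // 10 * 10
        counts[decade][rec["genre"]] += 1
    shares = {}
    for decade, genre_counts in counts.items():
        total = sum(genre_counts.values())
        shares[decade] = {g: n / total for g, n in genre_counts.items()}
    return shares


shares = genre_shares_by_decade(songs)
for decade in sorted(shares):
    print(decade, shares[decade])
```

Normalizing within each decade matters here: because far more songs come from recent years, raw counts alone would hide how the mix of genres shifts over time.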

2. Lyrics level analysis

Now that we have a general idea of music genres throughout the years, let’s dig into the lyrics.

First, let’s generate an overall word cloud of all genres:

It seems that “you’re”, “love”, “time”, and “baby” are the most popular words across all songs. “You’re love, you’re baby through life time.” Is that what the singers most want to tell us? Haha.
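A word cloud is driven by word frequencies, so the underlying computation is just a count over all processed tokens. A minimal Python sketch, assuming each song has already been reduced to a list of cleaned tokens (the token lists below are made up):

```python
from collections import Counter

# Made-up token lists, one per song; assumed to be already cleaned.
tokens_per_song = [
    ["love", "baby", "time"],
    ["love", "love", "baby"],
    ["love", "time", "night"],
]

# Count every token across all songs; larger counts would be drawn larger.
freq = Counter(token for song in tokens_per_song for token in song)
print(freq.most_common(3))
```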

How many words are there in songs from different genres?

Most certainly, Hip-Hop songs have the most words, over 500 per song on average, while the other genres are pretty much the same. In the boxplot, we can see that even though Hip-Hop has the highest mean, the maximum word count comes from Rock, at over 6,000 words! The word counts for Rock are highly skewed as well.
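To see how one genre can have the highest mean while another holds the maximum, it helps to compare the mean, median, and maximum side by side. The Python sketch below uses invented word counts, chosen only to mimic the pattern described; they are not values from the data set.

```python
from statistics import mean, median

# Invented word counts per song, for illustration only.
nword_by_genre = {
    "Hip-Hop": [480, 520, 560, 610],
    "Rock": [200] * 19 + [6000],  # many short songs plus one extreme outlier
}

summary = {
    genre: {"mean": mean(counts), "median": median(counts), "max": max(counts)}
    for genre, counts in nword_by_genre.items()
}
for genre, stats in summary.items():
    print(genre, stats)
```

In this toy data the Rock mean sits far above its median, which is the usual sign of a right-skewed distribution, while the Hip-Hop mean and median are close together.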

Now, let’s take a look at the sentiment of lyrics in different genres:

We can see that Hip-Hop and Metal have more negative words than positive ones, while Electronic music is the opposite.
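A comparison like this is typically produced by matching each lyric word against a sentiment lexicon and counting positive and negative hits per genre. The Python sketch below assumes that approach; the mini-lexicon and the token lists are hypothetical, and a real analysis would use a published lexicon.

```python
from collections import Counter

# Hypothetical mini-lexicon for illustration; not the lexicon used in the study.
LEXICON = {
    "love": "positive", "happy": "positive", "sweet": "positive",
    "hate": "negative", "pain": "negative", "dead": "negative",
}

# Made-up cleaned tokens per genre.
tokens_by_genre = {
    "Metal": ["pain", "dead", "love", "night"],
    "Electronic": ["love", "happy", "sweet", "pain", "dance"],
}


def sentiment_counts(tokens):
    """Count lexicon hits; words missing from the lexicon are dropped,
    which behaves like an inner join on the word column."""
    return Counter(LEXICON[t] for t in tokens if t in LEXICON)


result = {genre: sentiment_counts(toks) for genre, toks in tokens_by_genre.items()}
for genre, counts in result.items():
    net = counts["positive"] - counts["negative"]
    print(genre, dict(counts), "net:", net)
```

Note that this word-level method ignores negation and context, so “not happy” would still count as positive.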

At the end

From this study, we found several fun facts about modern songs and lyrics.

First, since the start of the new millennium, the music industry has become more productive. In 2006 and 2007 in particular, there is a song population boom. Among all genres, Rock takes up the largest share of the music world, while Pop has prospered since the 90s and is now the second most popular genre.

Then, at the lyric level, we find that Hip-Hop songs typically have significantly more words than the others, and that the word count distribution of Rock is highly skewed, which may indicate that Rock has very flexible styles.

Last but not least, “love” is the most popular and most enduring word or topic in the music world. However, the music world never lacks diversity: Metal focuses more on spiritual topics, while Hip-Hop probably has some “bad language” issues.